PrivBasis: Frequent Itemset Mining with Differential Privacy

نویسندگان

  • Ninghui Li
  • Wahbeh H. Qardaji
  • Dong Su
  • Jianneng Cao
چکیده

The discovery of frequent itemsets can serve valuable economic and research purposes. Releasing discovered frequent itemsets, however, presents privacy challenges. In this paper, we study the problem of how to perform frequent itemset mining on transaction databases while satisfying differential privacy. We propose an approach, called PrivBasis, which leverages a novel notion called basis sets. A θ-basis set has the property that any itemset with frequency higher than θ is a subset of some basis. We introduce algorithms for privately constructing a basis set and then using it to find the most frequent itemsets. Experiments show that our approach greatly outperforms the current state of the art.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Study of Differentially Private Frequent Itemset Mining

Frequent sets play an important role in many Data Mining tasks that try to search interesting patterns from databases, such as association rules, sequences, correlations, episodes, classifiers and clusters. FrequentItemsets Mining (FIM) is the most well-known techniques to extract knowledge from dataset. In this paper differential privacy aims to get means to increase the accuracy of queries fr...

متن کامل

Personalized Privacy-Preserving Frequent Itemset Mining Using Randomized Response

Frequent itemset mining is the important first step of association rule mining, which discovers interesting patterns from the massive data. There are increasing concerns about the privacy problem in the frequent itemset mining. Some works have been proposed to handle this kind of problem. In this paper, we introduce a personalized privacy problem, in which different attributes may need differen...

متن کامل

Privacy-Preserving Frequent Itemset Mining for Sparse and Dense Data

Frequent itemset mining is a task that can in turn be used for other purposes such as associative rule mining. One problem is that the data may be sensitive, and its owner may refuse to give it for analysis in plaintext. There exist many privacy-preserving solutions for frequent itemset mining, but in any case enhancing the privacy inevitably spoils the efficiency. Leaking some less sensitive i...

متن کامل

Mining Frequent Patterns with Differential Privacy

The mining of frequent patterns is a fundamental component in many data mining tasks. A considerable amount of research on this problem has led to a wide series of efficient and scalable algorithms for mining frequent patterns. However, releasing these patterns is posing concerns on the privacy of the users participating in the data. Indeed the information from the patterns can be linked with a...

متن کامل

Privacy Protection Using Sensitive Data Protection Algorithm In Frequent Itemset Mining Of Medical Datasets

Frequent Itemset Mining (FIM) is one of the most eminent techniques in the Data mining systems. The exploration of Frequent Itemset Mining distills the recurring knowledge from the incessant data. Explosion of Frequent Itemset Mining in the field of Data Analysis and Data Mining becomes an inescapable one. The paper focuses on “searching the accurate records of efficient database queries withou...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • PVLDB

دوره 5  شماره 

صفحات  -

تاریخ انتشار 2012